Overview

Dataset statistics

Number of variables: 8
Number of observations: 12838
Missing cells: 116
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 902.7 KiB
Average record size in memory: 72.0 B
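
The derived figures in this block can be checked from the raw counts. The following is a minimal sketch, assuming that "Missing cells (%)" is the share of missing cells among all cells (variables × observations) and that the average record size is the total memory divided by the number of observations. The variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics.
n_variables = 8
n_observations = 12838
missing_cells = 116
total_size_kib = 902.7

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of missing cells, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, spread evenly over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both values agree with the figures reported above.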

Variable types

Numeric: 8

Alerts

observedTempMax is highly correlated with observedTempMin and 2 other fields (High correlation)
observedTempMin is highly correlated with observedTempMax and 3 other fields (High correlation)
observedHumidity is highly correlated with observedTempMin and 2 other fields (High correlation)
observedPressure is highly correlated with observedTempMax and 3 other fields (High correlation)
observedRainfall is highly correlated with observedTempMax and 3 other fields (High correlation)
observedTempMax is highly correlated with observedTempMin and 1 other field (High correlation)
observedTempMin is highly correlated with observedTempMax and 3 other fields (High correlation)
observedHumidity is highly correlated with observedTempMin and 2 other fields (High correlation)
observedPressure is highly correlated with observedTempMax and 3 other fields (High correlation)
observedRainfall is highly correlated with observedTempMin and 2 other fields (High correlation)
observedTempMax is highly correlated with observedTempMin (High correlation)
observedTempMin is highly correlated with observedTempMax and 2 other fields (High correlation)
observedHumidity is highly correlated with observedRainfall (High correlation)
observedPressure is highly correlated with observedTempMin and 1 other field (High correlation)
observedRainfall is highly correlated with observedTempMin and 2 other fields (High correlation)
stationID is highly correlated with observedWind (High correlation)
Days is highly correlated with observedTempMax and 5 other fields (High correlation)
observedTempMax is highly correlated with Days and 4 other fields (High correlation)
observedTempMin is highly correlated with Days and 4 other fields (High correlation)
observedHumidity is highly correlated with Days and 4 other fields (High correlation)
observedPressure is highly correlated with Days and 4 other fields (High correlation)
observedWind is highly correlated with stationID and 1 other field (High correlation)
observedRainfall is highly correlated with Days and 4 other fields (High correlation)
observedRainfall has 1042 (8.1%) zeros (Zeros)

Reproduction

Analysis started: 2022-06-03 17:12:51.008812
Analysis finished: 2022-06-03 17:13:06.139221
Duration: 15.13 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

stationID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 36
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15817.08599
Minimum: 10120
Maximum: 41977
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 10120
5-th percentile: 10208
Q1: 11111
Median: 11706
Q3: 11929
95-th percentile: 41958
Maximum: 41977
Range: 31857
Interquartile range (IQR): 818

Descriptive statistics

Standard deviation: 10755.02454
Coefficient of variation (CV): 0.6799624497
Kurtosis: 2.05577409
Mean: 15817.08599
Median Absolute Deviation (MAD): 390
Skewness: 2.008270185
Sum: 203059750
Variance: 115670552.9
Monotonicity: Increasing
Histogram with fixed size bins (bins=36)
Common values

Value                 Count    Frequency (%)
11805                 366      2.9%
10208                 366      2.9%
11505                 366      2.9%
11513                 366      2.9%
11809                 366      2.9%
11921                 366      2.9%
11929                 366      2.9%
10120                 366      2.9%
10320                 366      2.9%
11925                 366      2.9%
Other values (26)     9178     71.5%

Minimum 10 values

Value                 Count    Frequency (%)
10120                 366      2.9%
10208                 366      2.9%
10320                 366      2.9%
10408                 366      2.9%
10609                 366      2.9%
10705                 366      2.9%
10724                 366      2.9%
10910                 366      2.9%
11111                 366      2.9%
11313                 366      2.9%

Maximum 10 values

Value                 Count    Frequency (%)
41977                 366      2.9%
41958                 366      2.9%
41947                 28       0.2%
41926                 366      2.9%
41909                 366      2.9%
41858                 366      2.9%
12110                 366      2.9%
12103                 366      2.9%
12007                 366      2.9%
11929                 366      2.9%

Days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 366
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 183.1314068
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 18
Q1: 91
Median: 183
Q3: 275
95-th percentile: 348
Maximum: 366
Range: 365
Interquartile range (IQR): 184

Descriptive statistics

Standard deviation: 105.8382791
Coefficient of variation (CV): 0.5779362533
Kurtosis: -1.201771358
Mean: 183.1314068
Median Absolute Deviation (MAD): 92
Skewness: 0.00150696532
Sum: 2351041
Variance: 11201.74132
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
8                     36       0.3%
1                     36       0.3%
17                    36       0.3%
25                    36       0.3%
2                     36       0.3%
10                    36       0.3%
18                    36       0.3%
26                    36       0.3%
3                     36       0.3%
11                    36       0.3%
Other values (356)    12478    97.2%

Minimum 10 values

Value                 Count    Frequency (%)
1                     36       0.3%
2                     36       0.3%
3                     36       0.3%
4                     36       0.3%
5                     36       0.3%
6                     36       0.3%
7                     36       0.3%
8                     36       0.3%
9                     36       0.3%
10                    36       0.3%

Maximum 10 values

Value                 Count    Frequency (%)
366                   35       0.3%
365                   35       0.3%
364                   35       0.3%
363                   35       0.3%
362                   35       0.3%
361                   35       0.3%
360                   35       0.3%
359                   35       0.3%
358                   35       0.3%
357                   35       0.3%

observedTempMax
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 9330
Distinct (%): 72.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.63729508
Minimum: 21.42068966
Maximum: 36.85483871
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 21.42068966
5-th percentile: 25.08679487
Q1: 29.1493956
Median: 31.47266214
Q3: 32.44566636
95-th percentile: 34.09111888
Maximum: 36.85483871
Range: 15.43414905
Interquartile range (IQR): 3.296270752

Descriptive statistics

Standard deviation: 2.752837476
Coefficient of variation (CV): 0.08985249735
Kurtosis: 0.06089005398
Mean: 30.63729508
Median Absolute Deviation (MAD): 1.252405732
Skewness: -0.8403224256
Sum: 393321.5943
Variance: 7.578114167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
31.54871795            10       0.1%
30.85641026            9        0.1%
32.4                   9        0.1%
31.85128205            9        0.1%
31.85641026            9        0.1%
32.2974359             9        0.1%
32.40512821            9        0.1%
31.72307692            9        0.1%
32.22820513            8        0.1%
32.23076923            8        0.1%
Other values (9320)    12749    99.3%

Minimum 10 values

Value                  Count    Frequency (%)
21.42068966            1        < 0.1%
21.91794872            1        < 0.1%
21.96206897            1        < 0.1%
21.99655172            1        < 0.1%
22.03076923            1        < 0.1%
22.07241379            1        < 0.1%
22.09230769            1        < 0.1%
22.10512821            1        < 0.1%
22.11538462            1        < 0.1%
22.12413793            1        < 0.1%

Maximum 10 values

Value                  Count    Frequency (%)
36.85483871            1        < 0.1%
36.84193548            1        < 0.1%
36.78064516            2        < 0.1%
36.75384615            1        < 0.1%
36.67419355            1        < 0.1%
36.59487179            1        < 0.1%
36.55641026            1        < 0.1%
36.55128205            1        < 0.1%
36.53846154            2        < 0.1%
36.53548387            1        < 0.1%

observedTempMin
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 9535
Distinct (%): 74.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.32676368
Minimum: 9.252631579
Maximum: 27.17096774
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 9.252631579
5-th percentile: 12.15897436
Q1: 17.04679487
Median: 23.53076923
Q3: 25.51702381
95-th percentile: 26.26451613
Maximum: 27.17096774
Range: 17.91833616
Interquartile range (IQR): 8.470228938

Descriptive statistics

Standard deviation: 4.904167693
Coefficient of variation (CV): 0.2299536754
Kurtosis: -0.9072668793
Mean: 21.32676368
Median Absolute Deviation (MAD): 2.487179487
Skewness: -0.723814616
Sum: 273792.9921
Variance: 24.05086076
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
26.1                   17       0.1%
25.73076923            11       0.1%
26                     10       0.1%
26.25384615            10       0.1%
25.96666667            10       0.1%
26.04871795            10       0.1%
25.5                   10       0.1%
26.06153846            10       0.1%
25.91025641            10       0.1%
25.6                   9        0.1%
Other values (9525)    12731    99.2%

Minimum 10 values

Value                  Count    Frequency (%)
9.252631579            1        < 0.1%
9.3875                 1        < 0.1%
9.486842105            1        < 0.1%
9.556410256            1        < 0.1%
9.55862069             1        < 0.1%
9.629411765            1        < 0.1%
9.634210526            1        < 0.1%
9.665789474            1        < 0.1%
9.679487179            1        < 0.1%
9.679487179            1        < 0.1%

Maximum 10 values

Value                  Count    Frequency (%)
27.17096774            1        < 0.1%
27.09677419            1        < 0.1%
27.01538462            1        < 0.1%
26.98717949            1        < 0.1%
26.96774194            1        < 0.1%
26.96451613            1        < 0.1%
26.95128205            1        < 0.1%
26.91794872            1        < 0.1%
26.87741935            1        < 0.1%
26.87692308            1        < 0.1%

observedHumidity
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 3781
Distinct (%): 29.5%
Missing: 30
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.9061145
Minimum: 56.69230769
Maximum: 97
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 56.69230769
5-th percentile: 68.43113553
Q1: 76.1025641
Median: 80.41025641
Q3: 84.8974359
95-th percentile: 88.35897436
Maximum: 97
Range: 40.30769231
Interquartile range (IQR): 8.794871795

Descriptive statistics

Standard deviation: 6.13144073
Coefficient of variation (CV): 0.07673306065
Kurtosis: -0.02732654053
Mean: 79.9061145
Median Absolute Deviation (MAD): 4.41025641
Skewness: -0.6059856807
Sum: 1023437.514
Variance: 37.59456542
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
85                     51       0.4%
80                     51       0.4%
79                     50       0.4%
77                     47       0.4%
81                     44       0.3%
82                     41       0.3%
78                     37       0.3%
84                     36       0.3%
76                     36       0.3%
86                     33       0.3%
Other values (3771)    12382    96.4%

Minimum 10 values

Value                  Count    Frequency (%)
56.69230769            1        < 0.1%
56.94871795            1        < 0.1%
57.15384615            1        < 0.1%
58.15384615            1        < 0.1%
58.30769231            1        < 0.1%
58.43589744            1        < 0.1%
58.92307692            1        < 0.1%
58.97435897            1        < 0.1%
59.02564103            2        < 0.1%
59.12820513            1        < 0.1%

Maximum 10 values

Value                  Count    Frequency (%)
97                     1        < 0.1%
96                     2        < 0.1%
95                     4        < 0.1%
94                     6        < 0.1%
93                     3        < 0.1%
92                     7        0.1%
91.66666667            1        < 0.1%
91.56410256            1        < 0.1%
91.47368421            1        < 0.1%
91.35897436            1        < 0.1%

observedPressure
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 9485
Distinct (%): 74.0%
Missing: 28
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1008.018912
Minimum: 974.8538462
Maximum: 1039.406897
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 974.8538462
5-th percentile: 1000.272073
Q1: 1003.298077
Median: 1008.169231
Q3: 1012.775827
95-th percentile: 1015.285985
Maximum: 1039.406897
Range: 64.5530504
Interquartile range (IQR): 9.477749663

Descriptive statistics

Standard deviation: 5.452238648
Coefficient of variation (CV): 0.005408865431
Kurtosis: 1.548732333
Mean: 1008.018912
Median Absolute Deviation (MAD): 4.730952381
Skewness: -0.5569541696
Sum: 12912722.27
Variance: 29.72690627
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
1012                   11       0.1%
1000.4                 11       0.1%
1015.3                 11       0.1%
1015.1                 8        0.1%
1012.7                 8        0.1%
1004.8                 8        0.1%
1003.5                 8        0.1%
1007.9                 8        0.1%
1003.082051            7        0.1%
1014.9                 7        0.1%
Other values (9475)    12723    99.1%
(Missing)              28       0.2%

Minimum 10 values

Value                  Count    Frequency (%)
974.8538462            1        < 0.1%
975.8538462            1        < 0.1%
976.9897436            1        < 0.1%
979                    1        < 0.1%
979.3137931            1        < 0.1%
979.462069             1        < 0.1%
979.5793103            1        < 0.1%
979.7206897            1        < 0.1%
979.8172414            1        < 0.1%
979.8448276            1        < 0.1%

Maximum 10 values

Value                  Count    Frequency (%)
1039.406897            1        < 0.1%
1038.913793            1        < 0.1%
1037.212121            1        < 0.1%
1034.173684            1        < 0.1%
1033.146154            1        < 0.1%
1029.371053            1        < 0.1%
1023.255172            1        < 0.1%
1018.8                 2        < 0.1%
1018.430769            1        < 0.1%
1018.2                 1        < 0.1%

observedWind
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6503
Distinct (%): 50.8%
Missing: 30
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.202190545
Minimum: 0
Maximum: 12.2
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.607692308
Q1: 2.279411765
Median: 2.984615385
Q3: 3.77742915
95-th percentile: 5.664102564
Maximum: 12.2
Range: 12.2
Interquartile range (IQR): 1.498017385

Descriptive statistics

Standard deviation: 1.336709934
Coefficient of variation (CV): 0.4174361005
Kurtosis: 4.06988502
Mean: 3.202190545
Median Absolute Deviation (MAD): 0.7435897436
Skewness: 1.591977538
Sum: 41013.65651
Variance: 1.786793449
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
2                      82       0.6%
3                      48       0.4%
4                      39       0.3%
5                      33       0.3%
2.5                    32       0.2%
2.3                    20       0.2%
3.3                    20       0.2%
3.5                    20       0.2%
2.2                    18       0.1%
3.2                    18       0.1%
Other values (6493)    12478    97.2%
(Missing)              30       0.2%

Minimum 10 values

Value                  Count    Frequency (%)
0                      4        < 0.1%
0.2857142857           1        < 0.1%
0.5820512821           1        < 0.1%
0.5846153846           1        < 0.1%
0.7552631579           1        < 0.1%
0.7702702703           1        < 0.1%
0.8459459459           1        < 0.1%
0.8461538462           1        < 0.1%
0.8473684211           1        < 0.1%
0.8594594595           1        < 0.1%

Maximum 10 values

Value                  Count    Frequency (%)
12.2                   1        < 0.1%
11.8                   1        < 0.1%
10.4                   1        < 0.1%
10.35294118            1        < 0.1%
10.1                   1        < 0.1%
10.08529412            1        < 0.1%
10.04705882            1        < 0.1%
10.02941176            1        < 0.1%
10.01764706            1        < 0.1%
10                     1        < 0.1%

observedRainfall
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)
ZEROS

Distinct: 3351
Distinct (%): 26.2%
Missing: 28
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.70611249
Minimum: 0
Maximum: 57.1025641
Zeros: 1042
Zeros (%): 8.1%
Negative: 0
Negative (%): 0.0%
Memory size: 200.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.5128205128
Median: 4.124708625
Q3: 10.56410256
95-th percentile: 22.03974359
Maximum: 57.1025641
Range: 57.1025641
Interquartile range (IQR): 10.05128205

Descriptive statistics

Standard deviation: 7.661600358
Coefficient of variation (CV): 1.142480143
Kurtosis: 3.269712888
Mean: 6.70611249
Median Absolute Deviation (MAD): 3.970862471
Skewness: 1.620469935
Sum: 85905.30099
Variance: 58.70012004
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
0                      1042     8.1%
0.02564102564          180      1.4%
0.05128205128          145      1.1%
0.07692307692          110      0.9%
0.1025641026           97       0.8%
0.1794871795           80       0.6%
0.1538461538           76       0.6%
0.3333333333           71       0.6%
0.2307692308           65       0.5%
0.1282051282           64       0.5%
Other values (3341)    10880    84.7%

Minimum 10 values

Value                  Count    Frequency (%)
0                      1042     8.1%
0.01282051282          1        < 0.1%
0.02564102564          180      1.4%
0.02631578947          30       0.2%
0.02702702703          4        < 0.1%
0.02777777778          6        < 0.1%
0.02857142857          12       0.1%
0.02941176471          6        < 0.1%
0.0303030303           10       0.1%
0.03225806452          8        0.1%

Maximum 10 values

Value                  Count    Frequency (%)
57.1025641             1        < 0.1%
51.12820513            1        < 0.1%
50.92307692            1        < 0.1%
50.74285714            1        < 0.1%
49.20512821            1        < 0.1%
49.00641026            1        < 0.1%
48.61538462            1        < 0.1%
47.97435897            1        < 0.1%
47.48571429            1        < 0.1%
47.25641026            1        < 0.1%

Interactions

[Figure: pairwise interaction plots of the variables (64 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
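
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report's own implementation: the function names are hypothetical, and tied values are assumed to receive the average of the ranks they span.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]  # nonlinear, but strictly increasing in x
print(spearman_rho(x, y))
```

Because y increases whenever x does, the two rank sequences coincide and ρ comes out as 1 (up to floating-point rounding), even though the relationship is not linear.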

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
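
As a minimal sketch of this calculation, the snippet below computes r in plain Python for the observedTempMax and observedTempMin values of the first five sample rows of this report. The function name is hypothetical, and five rows are far too few to reproduce the coefficient shown in the report's heatmap.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    # The 1/n factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


# First five sample rows (station 10120, days 1 to 5).
temp_max = [23.182051, 22.997436, 22.276923, 22.115385, 22.325641]
temp_min = [10.735897, 10.738462, 10.566667, 10.712821, 10.274359]

r = pearson_r(temp_max, temp_min)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(temp_max, [3 * v + 7 for v in temp_min])
print(r, r_rescaled)
```

The two printed values agree up to floating-point rounding, which illustrates the invariance under changes in location and scale.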

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
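
The pair-counting definition above can be sketched directly in Python. This is an illustrative implementation of the formula as stated (often called tau-a), with a hypothetical function name. Pairs tied in either variable count as neither concordant nor discordant here, whereas statistical libraries commonly use a tie-corrected variant (tau-b), so values can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted as neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.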

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

(index)  stationID  Days  observedTempMax  observedTempMin  observedHumidity  observedPressure  observedWind  observedRainfall
0            10120     1        23.182051        10.735897         77.586207       1016.255172      3.968966          0.034483
1            10120     2        22.997436        10.738462         78.655172       1016.424138      4.017241          0.310345
2            10120     3        22.276923        10.566667         80.551724       1016.227586      4.306897          0.448276
3            10120     4        22.115385        10.712821         79.862069       1016.027586      4.234483          0.758621
4            10120     5        22.325641        10.274359         79.758621       1015.627586      4.434483          0.034483
5            10120     6        22.648718        10.028205         79.000000       1015.765517      3.410345          0.000000
6            10120     7        22.512821        10.310256         79.655172       1016.141379      3.972414          0.000000
7            10120     8        22.533333        10.182051         78.931034       1015.827586      3.806897          0.827586
8            10120     9        22.092308        10.494872         80.034483       1015.786207      3.679310          0.655172
9            10120    10        21.917949         9.892308         79.413793       1015.903448      3.558621          0.000000

Last rows

(index)  stationID  Days  observedTempMax  observedTempMin  observedHumidity  observedPressure  observedWind  observedRainfall
12828        41977   357        26.904762        15.147619               NaN               NaN           NaN               NaN
12829        41977   358        27.409524        15.380952               NaN               NaN           NaN               NaN
12830        41977   359        27.304762        15.228571               NaN               NaN           NaN               NaN
12831        41977   360        27.100000        14.938095               NaN               NaN           NaN               NaN
12832        41977   361        26.733333        15.042857               NaN               NaN           NaN               NaN
12833        41977   362        26.633333        14.795238               NaN               NaN           NaN               NaN
12834        41977   363        27.047619        14.319048               NaN               NaN           NaN               NaN
12835        41977   364        27.414286        14.376190               NaN               NaN           NaN               NaN
12836        41977   365        27.314286        14.476190               NaN               NaN           NaN               NaN
12837        41977   366        26.900000        13.620000               NaN               NaN           NaN               NaN